37 research outputs found

    Fuzzy measures and integrals in re-identification problems

    In this paper we give an overview of our approach of using aggregation operators and, more specifically, fuzzy integrals for solving re-identification problems. We show that Choquet integrals are suitable for some kinds of problems. Postprint (author’s final draft).
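
    As a rough illustration of the kind of aggregation mentioned above, the following sketch computes the discrete Choquet integral of a set of values with respect to a fuzzy measure. It is not taken from the paper: the function name `choquet_integral`, the dictionary representation of the values, and the example measures are all assumptions made for illustration.

```python
from typing import Callable, Dict, FrozenSet, Hashable


def choquet_integral(
    values: Dict[Hashable, float],
    mu: Callable[[FrozenSet[Hashable]], float],
) -> float:
    """Discrete Choquet integral of non-negative `values` w.r.t. measure `mu`.

    `mu` is assumed to be a fuzzy measure: monotone, with mu(empty set) = 0
    and mu(full set) = 1. These properties are not checked here.
    """
    # Sort the elements by increasing value.
    items = sorted(values, key=values.get)
    total = 0.0
    previous = 0.0
    for i, x in enumerate(items):
        # Set of elements whose value is at least values[x].
        upper_set = frozenset(items[i:])
        total += (values[x] - previous) * mu(upper_set)
        previous = values[x]
    return total


# Hypothetical example data.
scores = {"a": 1.0, "b": 2.0, "c": 3.0}
weights = {"a": 0.2, "b": 0.3, "c": 0.5}


def additive_measure(subset: FrozenSet[Hashable]) -> float:
    # An additive measure: the Choquet integral reduces to a weighted mean.
    return sum(weights[x] for x in subset)


def max_measure(subset: FrozenSet[Hashable]) -> float:
    # Measure equal to 1 on every non-empty set: the integral is the maximum.
    return 1.0 if subset else 0.0
```

    With the additive measure the integral coincides with the weighted mean of the values, and with the second measure it returns the maximum, which shows how the choice of measure changes the aggregation.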

    Blocking anonymized data

    Privacy is now an important issue, and for this reason many researchers are working on the development of new data protection methods. The aim of these methods is to minimize the disclosure risk (DR) while preserving data utility. Consequently, there is an increasing demand for better methods to evaluate the DR. A standard measure to evaluate disclosure risk is record linkage (RL). Normally, when data sets are very large, RL has to split the data sets into blocks to reduce its computational cost. Standard blocking methods need a non-protected attribute to build the blocks and, for this reason, they are not a good option when the protected data set is completely masked. In this paper, we propose a new blocking method which does not need a blocking key to build the blocks and is therefore suitable for splitting fully protected data sets. The method is based on aggregation operators, in particular on the OWA operator. Peer reviewed. Postprint (author’s final draft).

    Modeling projections in microaggregation

    Microaggregation is a method used by statistical agencies to limit the disclosure of sensitive microdata. It has been proven that microaggregation is an NP-hard problem when more than one variable is microaggregated at the same time. To solve this problem heuristically, a few methods based on projections have been introduced in the literature. The main drawback of such methods is that the projected axis is computed by maximizing a statistical property (e.g., the global variance of the data), disregarding the fact that the aim of microaggregation is to keep the disclosure risk as low as possible for all records. In this paper we present some preliminary results on the application of aggregation functions for computing the projected axis. We show that, by using the Sugeno integral to calculate the projected axis, we can in some cases reduce the disclosure risk of the protected data (when projected microaggregation is used). Postprint (author’s final draft).
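
    The general idea of projection-based microaggregation described above can be sketched as follows: records are projected onto an axis, sorted by their projected value, grouped into blocks of at least k consecutive records, and replaced by their group centroid. This is only a minimal sketch under stated assumptions. The function name `projected_microaggregation` is hypothetical, the axis is simply passed in as a parameter, and the paper's Sugeno-integral-based computation of the axis is not reproduced here.

```python
from typing import List, Sequence, Tuple


def projected_microaggregation(
    records: Sequence[Sequence[float]],
    axis: Sequence[float],
    k: int,
) -> List[Tuple[float, ...]]:
    """Replace each record by the centroid of its group of >= k records.

    Groups are formed from consecutive records after sorting by the
    projection onto `axis` (an assumed, externally supplied direction).
    """
    n = len(records)
    if k < 1 or n < k:
        raise ValueError("need k >= 1 and at least k records")
    dim = len(records[0])

    # Projection of every record onto the axis (dot product).
    scores = [sum(v * a for v, a in zip(rec, axis)) for rec in records]
    order = sorted(range(n), key=scores.__getitem__)

    protected: List[Tuple[float, ...]] = [()] * n
    n_groups = n // k
    for g in range(n_groups):
        start = g * k
        # The last group absorbs the remainder, so its size is in [k, 2k-1].
        end = n if g == n_groups - 1 else start + k
        group = order[start:end]
        centroid = tuple(
            sum(records[i][d] for i in group) / len(group) for d in range(dim)
        )
        for i in group:
            protected[i] = centroid
    return protected


# Hypothetical example: seven two-dimensional records, projected on the x-axis.
example = [(1, 10), (2, 20), (3, 30), (10, 1), (11, 2), (12, 3), (13, 4)]
masked = projected_microaggregation(example, axis=(1, 0), k=3)
```

    Because the grouping depends entirely on the ordering induced by the axis, a different axis yields different groups and therefore a different level of protection for individual records.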

    Towards the use of OWA operators for record linkage

    Record linkage is used to establish links between records that, while belonging to two different files, correspond to the same individual. Classical approaches assume that the two files contain some common variables, which are the ones used to link the records. Recently, we introduced a new approach to link records among files when such common variables are not available. In this approach, re-identification is based on the so-called structural information. In this paper we study the use of OWA operators for extracting such structural information and, thus, allowing re-identification. Peer reviewed. Postprint (author’s final draft).
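
    For readers unfamiliar with the OWA (ordered weighted averaging) operator named above, the sketch below shows its standard definition: the inputs are sorted in decreasing order and the weights are applied to positions rather than to particular inputs. The function name `owa` and the example values are assumptions for illustration, and the sketch does not show how the paper extracts structural information.

```python
from typing import Sequence


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average: weights apply to ranks, not to sources."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    # Sort inputs from largest to smallest, then weight by position.
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical inputs: the weight vector selects the behaviour.
data = [3.0, 1.0, 2.0]
largest = owa(data, [1.0, 0.0, 0.0])   # all weight on the first rank
median = owa(data, [0.0, 1.0, 0.0])    # all weight on the middle rank
smallest = owa(data, [0.0, 0.0, 1.0])  # all weight on the last rank
```

    Since the result depends only on the ranks of the inputs, it is invariant to the order in which the variables are presented.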

    Continuous m-dimensional distorted probabilities

    Fuzzy measures, also known as non-additive measures, monotonic games, and capacities, have been used in many contexts, for example in economics, risk analysis, computer science, computer vision, machine learning and, more generally, mathematics. However, when looking at applications, one of the problems that still needs to be solved is how the measure should be defined in an easy and intuitive way. When the reference set is finite, a few families of measures have been established, e.g. distorted probabilities, k-additive measures and decomposable measures. But when the reference set is infinite, distorted probabilities are the only such family. In this paper we give a definition for m-dimensional distorted probabilities in the case that the reference set is not finite, and we study some properties of this family. We also give a definition for hierarchically decomposable m-dimensional distorted probabilities that relates to another family of measures defined for the finite case.
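
    To make the notion of a distorted probability concrete, the sketch below builds one in the simplest setting: a finite reference set and a single (one-dimensional) distortion function applied to an ordinary probability. This is only the familiar finite case, not the continuous m-dimensional definition proposed in the paper, and the names `distorted_probability`, `prob` and `distortion` are hypothetical.

```python
from typing import Callable, Dict, Hashable, Iterable


def distorted_probability(
    prob: Dict[Hashable, float],
    distortion: Callable[[float], float],
) -> Callable[[Iterable[Hashable]], float]:
    """Return the fuzzy measure mu(A) = distortion(P(A)).

    `distortion` is assumed to be non-decreasing on [0, 1] with
    distortion(0) = 0 and distortion(1) = 1; this is not checked.
    """
    def mu(subset: Iterable[Hashable]) -> float:
        # Ordinary (additive) probability of the subset, then distorted.
        return distortion(sum(prob[x] for x in subset))

    return mu


# Hypothetical example: a uniform probability distorted by squaring.
mu = distorted_probability({"a": 0.5, "b": 0.5}, lambda p: p ** 2)
```

    Here the measure of each singleton is 0.25 while the measure of the whole set is 1, so the resulting measure is not additive even though the underlying probability is.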

    Data privacy

    Data privacy studies methods, tools, and theory to avoid the disclosure of sensitive information. Its origin is in statistics, with the goal of ensuring the confidentiality of data gathered from censuses and questionnaires. The topic was later introduced in computer science, and more particularly in data mining, where, owing to the large amount of data currently available, it has attracted the interest of researchers, practitioners, and companies. In this paper we review the main topics related to data privacy and privacy-enhancing technologies.

    Dynamic reputation-based trust computation in private networks

    Technical Report IIIA-TR-2009-02. The use of collaborative network services in general, and web-based social network (WBSN) services in particular, is increasing and, therefore, the protection of the resources shared by network participants is becoming a crucial need. In a collaborative network, one of the main parameters on which access control relies is trust and reputation, since access to a resource may or may not be granted on the basis of the trust/reputation of the requesting node. Therefore, the calculation of the trust of the nodes becomes a very important issue, mainly in business-to-business (BtoB) social networks, where trustworthy nodes can increase their benefits by taking advantage of their good reputation in the network. To address this point, in this paper we propose a mechanism to dynamically compute the trust of nodes, based on their past behavior. The key characteristic of our proposal is that trust is computed in a private way. This is achieved by anonymizing the local log files that store information about the nodes' actions. Preprint.

    Spherical microaggregation: anonymizing sparse vector spaces

    Unstructured texts are a very popular data type, yet they remain largely unexplored in the privacy-preserving data mining field. We consider the problem of providing public information about a set of confidential documents. To that end we have developed a method to protect a Vector Space Model (VSM), so that it can be made public even if the documents it represents are private. This method is inspired by microaggregation, a popular protection method from statistical disclosure control, and is adapted to work with sparse and high-dimensional data sets.
